Abstract: The sparse signal processing literature often uses
random sensing matrices to obtain performance guarantees. 
Unfortunately, in the real world, sensing matrices do not always 
come from random processes. It is therefore desirable to evaluate 
whether an arbitrary matrix, or frame, is suitable for sensing 
sparse signals. To this end, the present paper investigates two 
parameters that measure the coherence of a frame: worst-case 
and average coherence. We first provide several examples of 
frames that have small spectral norm, worst-case coherence, 
and average coherence. Next, we present a new lower bound on 
worst-case coherence and compare it to the Welch bound. Later, 
we propose an algorithm that decreases the average coherence 
of a frame without changing its spectral norm or worst-case 
coherence. Finally, we use worst-case and average coherence, as 
opposed to the Restricted Isometry Property, to garner near- 
optimal probabilistic guarantees on both sparse signal detection 
and reconstruction in the presence of noise. This contrasts with 
recent results that only guarantee noiseless signal recovery from 
arbitrary frames, and which further assume independence across 
the nonzero entries of the signal — in a sense, requiring small 
average coherence replaces the need for such an assumption. 

I. Introduction 

Many classical applications, such as radar and error-correcting
codes, make use of over-complete spanning systems. Oftentimes, we
may view an over-complete spanning system as a frame. Take
$F = \{f_i\}_{i \in \mathcal{I}}$ to be a collection of vectors in
some separable Hilbert space $\mathcal{H}$. Then $F$ is a frame if
there exist frame bounds $A$ and $B$ with $0 < A \le B < \infty$
such that
$A\|x\|^2 \le \sum_{i \in \mathcal{I}} |\langle x, f_i \rangle|^2 \le B\|x\|^2$
for every $x \in \mathcal{H}$. When $A = B$, $F$ is called a tight
frame. For finite-dimensional unit norm frames, where
$\mathcal{I} = \{1, \ldots, N\}$, the worst-case coherence is a
useful parameter:

$$\mu_F := \max_{\substack{i, j \in \{1, \ldots, N\} \\ i \ne j}} |\langle f_i, f_j \rangle|. \tag{1}$$



Note that orthonormal bases are tight frames with $A = B = 1$
and have zero worst-case coherence. In both ways, frames 
form a natural generalization of orthonormal bases. 

In this paper, we only consider finite-dimensional frames. 
Those not familiar with frame theory can simply view a
finite-dimensional frame as an $M \times N$ matrix of rank $M$ whose
columns are the frame elements. With this view, the tightness
condition is equivalent to having the spectral norm be as small
as possible; for an $M \times N$ unit norm frame $F$, this equivalently
means $\|F\|_2^2 = N/M$.

Throughout the literature, applications require finite-dimensional
frames that are nearly tight and have small worst-case coherence
[1]–[8]. Among these, a foremost application is sparse signal
processing, where frames of small spectral norm and/or small
worst-case coherence are commonly used to analyze sparse signals
[4]–[8]. Recently, [9] introduced another notion of frame
coherence called average coherence:

$$\nu_F := \frac{1}{N-1} \max_{i \in \{1, \ldots, N\}} \Bigl| \sum_{\substack{j = 1 \\ j \ne i}}^{N} \langle f_i, f_j \rangle \Bigr|. \tag{2}$$



Note that, in addition to having zero worst-case coherence, 
orthonormal bases also have zero average coherence. It was 
established in [9] that when $\nu_F$ is sufficiently smaller than
$\mu_F$, a number of guarantees can be provided for sparse signal
processing. It is therefore evident from [1]–[9] that there is
a pressing need for nearly tight frames with small worst-case 
and average coherence, especially in the area of sparse signal 
processing. 
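As a rough numerical companion to definitions (1) and (2), the following sketch computes the three frame parameters discussed so far for a matrix whose columns are the frame elements. The function names and the two test frames (an orthonormal basis and the three-vector "Mercedes-Benz" frame in the plane) are our own illustrative choices, not objects from the development above.

```python
import numpy as np

def worst_case_coherence(F):
    """mu_F: largest |<f_i, f_j>| over distinct columns of F."""
    G = F.conj().T @ F                 # Gram matrix of inner products
    off = G - np.diag(np.diag(G))      # drop the i = j terms
    return np.abs(off).max()

def average_coherence(F):
    """nu_F: max over i of |sum_{j != i} <f_i, f_j>| / (N - 1)."""
    N = F.shape[1]
    G = F.conj().T @ F
    off = G - np.diag(np.diag(G))
    return np.abs(off.sum(axis=1)).max() / (N - 1)

def spectral_norm_sq(F):
    """||F||_2^2: squared largest singular value of F."""
    return np.linalg.norm(F, 2) ** 2

# Illustrative frames (assumptions of this sketch).
basis = np.eye(4)                                  # orthonormal basis
angles = 2 * np.pi * np.arange(3) / 3
mb = np.vstack([np.cos(angles), np.sin(angles)])   # 2 x 3 unit norm tight frame

for name, F in [("orthonormal basis", basis), ("Mercedes-Benz", mb)]:
    print(name, worst_case_coherence(F), average_coherence(F), spectral_norm_sq(F))
```

For the 2 x 3 example the squared spectral norm equals N/M = 3/2, in line with the tightness condition stated above.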

This paper offers four main contributions in this regard. 
First, we discuss three types of frames that exhibit small spec- 
tral norm, worst-case coherence, and average coherence: nor- 
malized Gaussian, random harmonic, and code-based frames. 
With all three frame parameters provably small, these frames 
are guaranteed to perform well in relevant applications. Second,
performance in many applications is dictated by worst-case
coherence [1]–[8]. It is therefore particularly important to
understand which worst-case coherence values are achievable.
To this end, the Welch bound [1] is commonly used in the
literature. However, the Welch bound is only tight when the
number of frame elements $N$ is less than the square of the
spatial dimension $M$ [1]. Another lower bound, given in [10]
and [11], beats the Welch bound when there are more frame
elements, but it is known to be loose for real frames [12].
Given this context, our next contribution is a new lower bound
on the worst-case coherence of real frames. Our bound beats
both the Welch bound and the bound in [10] and [11] when the
number of frame elements far exceeds the spatial dimension.
Third, since average coherence is new to the frame theory
literature, we investigate how it relates to worst-case coherence
and spectral norm. In particular, we want average coherence to
satisfy the following property, which is used in [9] to provide
various guarantees for sparse signal processing:



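Definition 1. We say an $M \times N$ unit norm frame $F$ satisfies
the Strong Coherence Property if

$$\text{(SCP-1)} \quad \mu_F \le \frac{1}{164 \log N} \qquad \text{and} \qquad \text{(SCP-2)} \quad \nu_F \le \frac{\mu_F}{\sqrt{M}},$$

where $\mu_F$ and $\nu_F$ are given by (1) and (2), respectively.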

Since average coherence is so new, there is currently no in- 
tuition as to when (SCP-2) is satisfied. As a third contribution, 
this paper shows how to transform a frame that satisfies (SCP- 
1) into another frame with the same spectral norm and worst- 
case coherence that additionally satisfies (SCP-2). Finally, this 
paper uses the Strong Coherence Property to provide new 
guarantees on both sparse signal detection and reconstruction 
in the presence of noise. These guarantees are related to those 
in [4], [5], [7], and we elaborate on this relationship in Section
V. In the interest of space, the proofs have been omitted
throughout, but they can be found in [13].

II. Frame constructions 

Many applications require nearly tight frames with small 
worst-case and average coherence. In this section, we give 
three types of frames that satisfy these conditions. 

A. Normalized Gaussian frames 

Construct a matrix with independent, Gaussian distributed 
entries that have zero mean and unit variance. By normalizing 
the columns, we get a matrix called a normalized Gaussian 
frame. This is perhaps the most widely studied type of frame 
in the signal processing and statistics literature. 

To be clear, the term "normalized" is intended to distinguish 
the results presented here from results reported in earlier 
works, such as [9], [14]–[16], which only ensure that the frame
elements of Gaussian frames have unit norm in expectation.
In other words, normalized Gaussian frames are frames with
individual frame elements independently and uniformly
distributed on the unit hypersphere in $\mathbb{R}^M$.

That said, the following theorem characterizes the spectral 
norm and the worst-case and average coherence of normalized 
Gaussian frames. 

Theorem 2 (Geometry of normalized Gaussian frames). Build
a real $M \times N$ frame $G$ by drawing entries independently
at random from a Gaussian distribution of zero mean and
unit variance. Next, construct a normalized Gaussian frame
$F$ by taking $f_n := g_n / \|g_n\|$ for every $n = 1, \ldots, N$.
Provided $60 \log N \le M \le \frac{N-1}{4 \log N}$, the following
inequalities simultaneously hold with probability exceeding
$1 - 11N^{-1}$:

(i) $\mu_F \le \dfrac{\sqrt{15 \log N}}{\sqrt{M} - \sqrt{12 \log N}}$,

(ii) $\nu_F \le \dfrac{\sqrt{15 \log N}}{M - \sqrt{12 M \log N}}$,

(iii) $\|F\|_2 \le \dfrac{\sqrt{M} + \sqrt{N} + \sqrt{2 \log N}}{\sqrt{M - \sqrt{8 M \log N}}}$.
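The construction in Theorem 2 is easy to try numerically. The sketch below draws a Gaussian matrix, normalizes its columns, and reports the three frame parameters; the sizes M = 32 and N = 128 and the seed are arbitrary choices of ours and lie outside the regime in which the theorem's bounds are stated, so the printed values are only meant to show the construction, not to verify the inequalities.

```python
import numpy as np

def normalized_gaussian_frame(M, N, seed=0):
    """Draw an M x N matrix of i.i.d. N(0, 1) entries and normalize each column."""
    rng = np.random.default_rng(seed)
    G = rng.standard_normal((M, N))
    return G / np.linalg.norm(G, axis=0, keepdims=True)

M, N = 32, 128                       # illustrative sizes (assumption)
F = normalized_gaussian_frame(M, N)

gram = F.T @ F
off = gram - np.diag(np.diag(gram))  # inner products between distinct columns
mu = np.abs(off).max()                           # worst-case coherence
nu = np.abs(off.sum(axis=1)).max() / (N - 1)     # average coherence
spec = np.linalg.norm(F, 2)                      # spectral norm

print(f"mu_F = {mu:.4f}, nu_F = {nu:.4f}, ||F||_2 = {spec:.4f}")
```

Since the columns have unit norm, the squared spectral norm can never fall below N/M, which gives a quick sanity check on the output.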



B. Random harmonic frames 

Random harmonic frames, constructed by randomly select- 
ing rows of a discrete Fourier transform (DFT) matrix and 
normalizing the resulting columns, have received considerable 
attention lately in the compressed sensing literature [17]–[19].



However, to the best of our knowledge, there is no result in the 
literature that shows that random harmonic frames have small 
worst-case coherence. To fill this gap, the following theorem 
characterizes the spectral norm and the worst-case and average 
coherence of random harmonic frames. 

Theorem 3 (Geometry of random harmonic frames). Let $U$ be
an $N \times N$ non-normalized discrete Fourier transform matrix,
explicitly $U_{k\ell} := e^{2\pi i k \ell / N}$ for each
$k, \ell = 0, \ldots, N-1$. Next, let $\{B_i\}_{i=0}^{N-1}$ be a
collection of independent Bernoulli random variables with mean
$M/N$, and take $\mathcal{M} := \{i : B_i = 1\}$. Finally, construct
an $|\mathcal{M}| \times N$ harmonic frame $F$ by collecting rows of
$U$ which correspond to indices in $\mathcal{M}$ and normalizing
the columns. Then $F$ is a unit norm tight frame:
$\|F\|_2^2 = N/|\mathcal{M}|$. Furthermore, provided
$16 \log N \le M \le N/3$, the following inequalities simultaneously
hold with probability exceeding $1 - 4N^{-1} - N^{-2}$:

(i) $\frac{1}{2} M \le |\mathcal{M}| \le \frac{3}{2} M$,
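A minimal sketch of the random harmonic construction follows, assuming the Bernoulli selectors have mean M/N so that about M rows are kept on average. The function name, the sizes and the seed are hypothetical choices for illustration.

```python
import numpy as np

def random_harmonic_frame(N, M, seed=0):
    """Keep each row of the N x N non-normalized DFT independently with
    probability M/N (an assumption of this sketch), then normalize columns."""
    rng = np.random.default_rng(seed)
    k = np.arange(N)
    U = np.exp(2j * np.pi * np.outer(k, k) / N)   # U[k, l] = exp(2*pi*i*k*l/N)
    rows = np.flatnonzero(rng.random(N) < M / N)  # indices with B_i = 1
    if rows.size == 0:
        raise ValueError("no rows selected; try another seed or a larger M")
    # Every entry of U has modulus 1, so dividing by sqrt(#rows) gives unit columns.
    F = U[rows, :] / np.sqrt(rows.size)
    return F, rows

N, M = 64, 16
F, rows = random_harmonic_frame(N, M)
frame_operator = F @ F.conj().T                   # equals (N / |rows|) * identity
print(rows.size, np.linalg.norm(F, 2) ** 2, N / rows.size)
```

Because distinct DFT rows are orthogonal, the frame operator is a multiple of the identity, which is the tightness claim in the theorem.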




C. Code-based frames 

Many structures in coding theory are also useful for con- 
structing frames. Here, we build frames from a code that 
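originally emerged with Berlekamp in [20], and found recent
reincarnation with [21]. We build a $2^m \times 2^{(t+1)m}$ frame,
indexing rows by elements of $\mathbb{F}_{2^m}$ and indexing columns
by $(t+1)$-tuples of elements from $\mathbb{F}_{2^m}$. For
$x \in \mathbb{F}_{2^m}$ and $\alpha \in \mathbb{F}_{2^m}^{t+1}$, the
corresponding entry of the matrix $F$ is

$$F_{x\alpha} := \frac{1}{\sqrt{2^m}} (-1)^{\mathrm{Tr}\left[\alpha_0 x + \sum_{i=1}^{t} \alpha_i x^{2^i + 1}\right]}, \tag{3}$$

where $\mathrm{Tr} : \mathbb{F}_{2^m} \to \mathbb{F}_2$ denotes the
trace map, defined by $\mathrm{Tr}(z) := \sum_{i=0}^{m-1} z^{2^i}$.
The following theorem gives the spectral norm and the worst-case
and average coherence of this frame.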

Theorem 4 (Geometry of code-based frames). The
$2^m \times 2^{(t+1)m}$ frame defined by (3) is unit norm and tight,
i.e., $\|F\|_2^2 = 2^{tm}$, with worst-case coherence
$\mu_F \le \frac{1}{\sqrt{2^{m-2t-1}}}$ and average coherence
$\nu_F \le \frac{\mu_F}{\sqrt{2^m}}$.

III. Fundamental limits on worst-case coherence 

In many applications of frames, performance is dictated by 
worst-case coherence. It is therefore particularly important to 
understand which worst-case coherence values are achievable. 
To this end, the following bound is commonly used in the 
literature: 

Theorem 5 (Welch bound [1]). Every $M \times N$ unit norm frame
$F$ has worst-case coherence
$\mu_F \ge \sqrt{\frac{N - M}{M(N - 1)}}$.

The Welch bound is not tight whenever $N > M^2$ [1]. For
this region, the following gives a better bound:

Theorem 6 ([10], [11]). Every $M \times N$ unit norm frame $F$
has worst-case coherence $\mu_F \ge 1 - 2N^{-1/(M-1)}$. Taking
$N = \Theta(a^M)$, this lower bound goes to $1 - \frac{2}{a}$ as
$M \to \infty$.



Algorithm 1 Linear-time flipping
Input: An $M \times N$ unit norm frame $F$
Output: An $M \times N$ unit norm frame $G$ that is flipping equivalent to $F$
  $g_1 \leftarrow f_1$ {Keep first frame element}
  for $n = 2$ to $N$ do
    if $\|\sum_{i=1}^{n-1} g_i + f_n\| \le \|\sum_{i=1}^{n-1} g_i - f_n\|$ then
      $g_n \leftarrow f_n$ {Keep frame element for shorter sum}
    else
      $g_n \leftarrow -f_n$ {Flip frame element for shorter sum}
    end if
  end for




Fig. 1. Different bounds on worst-case coherence for $M = 3$,
$N = 3, \ldots, 55$. Stars give numerically determined optimal
worst-case coherence of $N$ real unit vectors, found in [12].
Dotted curve gives Welch bound, dash-dotted curve gives bound from
Theorem 6, dashed curve gives general bound from Theorem 7, and
solid curve gives bound from Theorem 8.



For many applications, it does not make sense to use a 
complex frame, but the bound in Theorem 6 is known to be
loose for real frames [12]. We therefore improve Theorem 6
for the case of real unit norm frames:

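Theorem 7. Every real $M \times N$ unit norm frame $F$ has
worst-case coherence

$$\mu_F \ge \cos\left[ \pi \left( \frac{M-1}{N \pi^{1/2}} \, \frac{\Gamma\left(\frac{M-1}{2}\right)}{\Gamma\left(\frac{M}{2}\right)} \right)^{\frac{1}{M-1}} \right]. \tag{4}$$

Furthermore, taking $N = \Theta(a^M)$, this lower bound goes to
$\cos(\frac{\pi}{a})$ as $M \to \infty$.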

In [12], numerical results are given for $M = 3$, and we
compare these results to Theorems 6 and 7 in Figure 1.
Considering this figure, we note that the bound in Theorem 6 is
inferior to the maximum of the Welch bound and the bound in
Theorem 7, at least when $M = 3$. This illustrates the degree
to which Theorem 7 improves the bound in Theorem 6 for
real frames. In fact, since $\cos(\frac{\pi}{a}) \ge 1 - \frac{2}{a}$
for all $a \ge 2$, the bound for real frames in Theorem 7 is
asymptotically better than the bound for complex frames in
Theorem 6. Moreover, for $M = 2$, Theorem 7 says
$\mu_F \ge \cos(\frac{\pi}{N})$, and [22] proved this bound to be
tight for every $N \ge 2$. For $M = 3$, Theorem 7 can be further
improved as follows:

Theorem 8. Every real $3 \times N$ unit norm frame $F$ has
worst-case coherence $\mu_F \ge 1 - \frac{4}{N} + \frac{2}{N^2}$.

IV. Reducing average coherence 

In [9], average coherence is used to garner a number of
guarantees on sparse signal processing. Since average coher- 
ence is so new to the frame theory literature, this section 
will investigate how average coherence relates to worst-case 
coherence and the spectral norm. We start with a definition: 

Definition 9 (Wiggling and flipping equivalent frames). We 
say the frames F and G are wiggling equivalent if there exists 
a diagonal matrix $D$ of unimodular entries such that $G = FD$.



Furthermore, they are flipping equivalent if D is real, having 
only $\pm 1$'s on the diagonal.

The terms "wiggling" and "flipping" are inspired by the 
fact that individual frame elements of such equivalent frames 
are related by simple unitary operations. Note that every 
frame with $N$ nonzero frame elements belongs to a flipping
equivalence class of size $2^N$, while being wiggling equivalent
to uncountably many frames. The importance of this type of 
frame equivalence is, in part, due to the following lemma, 
which characterizes the shared geometry of wiggling equiva- 
lent frames: 

Lemma 10 (Geometry of wiggling equivalent frames). Wig- 
gling equivalence preserves the norms of frame elements, the 
worst-case coherence, and the spectral norm. 
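Lemma 10 can be checked numerically on a small example. The sketch below multiplies the columns of a random complex unit norm frame by random unimodular scalars and compares the quantities the lemma says are preserved; the sizes, the seed and the helper name are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)
M, N = 4, 9                                         # illustrative sizes
F = rng.standard_normal((M, N)) + 1j * rng.standard_normal((M, N))
F /= np.linalg.norm(F, axis=0, keepdims=True)       # unit norm columns

phases = np.exp(2j * np.pi * rng.random(N))         # unimodular diagonal of D
G = F * phases                                      # same as F @ np.diag(phases)

def coherence(A):
    """Worst-case coherence of a matrix with unit norm columns."""
    gram = np.abs(A.conj().T @ A)
    np.fill_diagonal(gram, 0.0)
    return gram.max()

print("column norms equal:", np.allclose(np.linalg.norm(F, axis=0),
                                         np.linalg.norm(G, axis=0)))
print("coherence:", coherence(F), coherence(G))
print("spectral norm:", np.linalg.norm(F, 2), np.linalg.norm(G, 2))
```

Average coherence is deliberately left out of the comparison: it is not among the quantities listed in the lemma, which is what makes searching within an equivalence class worthwhile.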

Now that we understand wiggling and flipping equivalence, 
we are ready for the main idea behind this section. Suppose we 
are given a unit norm frame with acceptable spectral norm and 
worst-case coherence, but we also want the average coherence 
to satisfy (SCP-2). Then by Lemma 10, all of the wiggling
equivalent frames will also have acceptable spectral norm and 
worst-case coherence, and so it is reasonable to check these 
frames for good average coherence. In fact, the following 
theorem guarantees that at least one of the flipping equivalent 
frames will have good average coherence, with only modest 
requirements on the original frame's redundancy. 

Theorem 11 (Frames with low average coherence). Let $F$ be
an $M \times N$ unit norm frame with $M < \frac{N-1}{4 \log 4N}$.
Then there exists a frame $G$ that is flipping equivalent to $F$
and satisfies $\nu_G \le \frac{\mu_G}{\sqrt{M}}$.

While Theorem 11 guarantees the existence of a flipping
equivalent frame with good average coherence, the result does
not describe how to find it. Certainly, one could check all $2^N$
frames in the flipping equivalence class, but such a procedure
takes time exponential in $N$. As an alternative, we propose a
linear-time flipping algorithm (Algorithm 1). The following theorem
guarantees that linear-time flipping will produce a frame with
good average coherence, but it requires the original frame's
redundancy to be higher than what suffices in Theorem 11.

Theorem 12. Suppose $N \ge M^2 + 3M + 3$. Then Algorithm 1
outputs an $M \times N$ frame $G$ that is flipping equivalent to $F$
and satisfies $\nu_G \le \frac{\mu_G}{\sqrt{M}}$.

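As an example of how linear-time flipping improves average
coherence, consider the following matrix:

[Matrix $F$, with entries of equal magnitude and signs $\pm$; the entries are not recoverable from the source.]

Here, $\nu_F \approx 0.3778 > 0.2683 \approx \frac{\mu_F}{\sqrt{M}}$.
Even though $N < M^2 + 3M + 3$, we can run linear-time flipping to
get a flipping pattern $D$, a diagonal matrix of $\pm 1$ entries.
Then $FD$ has average coherence
$\nu_{FD} \approx 0.1556 < \frac{\mu_F}{\sqrt{M}} = \frac{\mu_{FD}}{\sqrt{M}}$.
This example illustrates that the condition $N \ge M^2 + 3M + 3$
in Theorem 12 is sufficient but not necessary.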
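Algorithm 1 can be sketched in a few lines for real frames. The test frame below is a hypothetical one of our own (absolute values of Gaussian entries, so that all inner products start out positive and the average coherence is large); it is not the matrix from the example above, and the function names are ours.

```python
import numpy as np

def average_coherence(F):
    """nu_F = max_i |sum_{j != i} <f_i, f_j>| / (N - 1) for a real frame."""
    N = F.shape[1]
    gram = F.T @ F
    np.fill_diagonal(gram, 0.0)
    return np.abs(gram.sum(axis=1)).max() / (N - 1)

def linear_time_flipping(F):
    """Greedy sign selection following Algorithm 1; returns (G, signs), G = F * signs."""
    N = F.shape[1]
    signs = np.ones(N)
    running = F[:, 0].copy()                 # g_1 = f_1
    for n in range(1, N):
        f = F[:, n]
        if np.linalg.norm(running + f) <= np.linalg.norm(running - f):
            running += f                     # keep f_n
        else:
            signs[n] = -1.0                  # flip f_n
            running -= f
    return F * signs, signs

rng = np.random.default_rng(0)
F = np.abs(rng.standard_normal((5, 10)))     # hypothetical test frame
F /= np.linalg.norm(F, axis=0, keepdims=True)

G, signs = linear_time_flipping(F)
print("nu before:", average_coherence(F), "nu after:", average_coherence(G))
```

Each step increases the squared norm of the running sum by at most one, so the final sum has norm at most $\sqrt{N}$; this is the mechanism that keeps every row sum of the Gram matrix, and hence the average coherence, small.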



V. Near-optimal sparse signal processing without the Restricted Isometry Property

Frames with small spectral norm, worst-case coherence, 
and/or average coherence have found use in recent years with 
applications involving sparse signals. Donoho et al. used the 
worst-case coherence in [5] to provide uniform bounds on
the signal and support recovery performance of combinatorial
and convex optimization methods and greedy algorithms.
Later, Tropp [7] and Candès and Plan [4] used both the
spectral norm and worst-case coherence to provide tighter
bounds on the signal and support recovery performance of
convex optimization methods for most support sets under the
additional assumption that the sparse signals have independent
nonzero entries with zero median. Recently, Bajwa et al. [9]
made use of the spectral norm and both coherence parameters
to report tighter bounds on the noisy model selection and
noiseless signal recovery performance of an incredibly fast
greedy algorithm called one-step thresholding (OST) for most
support sets and arbitrary nonzero entries. In this section, we 
discuss further implications of the spectral norm and worst- 
case and average coherence of frames in applications involving 
sparse signals. 

A. The Weak Restricted Isometry Property 

A common task in signal processing applications is to test 
whether a collection of measurements corresponds to mere 
noise [23]. For applications involving sparse signals, one can
test measurements $y \in \mathbb{C}^M$ against the null hypothesis
$H_0 : y = e$ and alternative hypothesis $H_1 : y = Fx + e$, where
the entries of the noise vector $e \in \mathbb{C}^M$ are independent,
identical zero-mean complex-Gaussian random variables and the
signal $x \in \mathbb{C}^N$ is $K$-sparse. The performance of such
signal detection problems is directly proportional to the energy
in $Fx$ [23]–[25]. In particular, existing literature on the
detection of sparse signals [24], [25] leverages the fact that
$\|Fx\|^2 \approx \|x\|^2$ when $F$ satisfies the Restricted Isometry
Property (RIP) of order $K$. In contrast, we now show that the
Strong Coherence Property also guarantees
$\|Fx\|^2 \approx \|x\|^2$ for most $K$-sparse vectors. We start
with a definition:

Definition 13. We say an $M \times N$ frame $F$ satisfies the
$(K, \delta, p)$-Weak Restricted Isometry Property (Weak RIP) if for
every $K$-sparse vector $x \in \mathbb{C}^N$, a random permutation
$y$ of $x$'s entries satisfies

$$(1 - \delta)\|y\|^2 \le \|Fy\|^2 \le (1 + \delta)\|y\|^2$$

with probability exceeding $1 - p$.
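The probabilistic statement in Definition 13 lends itself to a simple Monte Carlo check for a fixed sparse vector. The sketch below estimates, for one given x, how often a random permutation of its entries violates the energy inequality; it is only an empirical probe for a single vector, not a certificate of Weak RIP, and the function name, trial count and test matrices are our own assumptions.

```python
import numpy as np

def weak_rip_failure_rate(F, x, delta, trials=2000, seed=0):
    """Fraction of random permutations y of x for which ||F y||^2 falls
    outside [(1 - delta) ||y||^2, (1 + delta) ||y||^2]."""
    rng = np.random.default_rng(seed)
    energy = np.linalg.norm(x) ** 2          # ||y||^2 is the same for every permutation
    failures = 0
    for _ in range(trials):
        y = rng.permutation(x)               # random permutation of x's entries
        ratio = np.linalg.norm(F @ y) ** 2 / energy
        if not (1 - delta <= ratio <= 1 + delta):
            failures += 1
    return failures / trials

x = np.array([1.0, 1.0, 0.0, 0.0, 0.0, 0.0])   # a 2-sparse vector

good = np.eye(6)                                # orthonormal basis: energy preserved
bad = np.zeros((3, 6))
bad[0, :] = 1.0                                 # six copies of the same unit vector

print(weak_rip_failure_rate(good, x, delta=0.1))
print(weak_rip_failure_rate(bad, x, delta=0.5))
```

The second matrix has unit norm columns but worst-case coherence equal to one, so every permutation of x doubles the energy and the inequality fails each time.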

We note the distinction between RIP and Weak RIP — 
Weak RIP requires that F preserves the energy of most sparse 
vectors. Moreover, the manner in which we quantify "most" 
is important. For each sparse vector, F preserves the energy 
of most permutations of that vector, but for different sparse 
vectors, F might not preserve the energy of permutations 
with the same support. That is, unlike RIP, Weak RIP is 
not a statement about the singular values of submatrices 
of F. Certainly, matrices for which most submatrices are 
well-conditioned, such as those discussed in [7], will satisfy
Weak RIP, but Weak RIP does not require this. That said, 
the following theorem shows, in part, the significance of the 
Strong Coherence Property. 

Theorem 14. Any $M \times N$ unit norm frame $F$ that satisfies
the Strong Coherence Property also satisfies the
$(K, \delta, \frac{4K}{N^2})$-Weak Restricted Isometry Property provided
$N \ge 128$ and
$2K \log N \le \min\{\frac{\delta^2}{100 \mu_F^2}, M\}$.

B. Reconstruction of sparse signals from noisy measurements 

Another common task in signal processing applications is to 
reconstruct a $K$-sparse signal $x \in \mathbb{C}^N$ from a small
collection of linear measurements $y \in \mathbb{C}^M$. Recently,
Tropp [7] used both the worst-case coherence and spectral norm of
frames to find bounds on the reconstruction performance of basis
pursuit (BP) [26] for most support sets under the assumption that
the nonzero entries of $x$ are independent with zero median. In
contrast, [9] used the spectral norm and worst-case and average
coherence of frames to find bounds on the reconstruction
performance of OST for most support sets and arbitrary
nonzero entries. However, both [7] and [9] limit themselves
to recovering $x$ in the absence of noise, corresponding to
$y = Fx$, a rather ideal scenario.

Our goal in this section is to provide guarantees for the 
reconstruction of sparse signals from noisy measurements 
$y = Fx + e$, where the entries of the noise vector
$e \in \mathbb{C}^M$ are independent, identical complex-Gaussian
random variables with mean zero and variance $\sigma^2$. In
particular, and in contrast with [5], our guarantees will hold for
arbitrary frames $F$ without requiring the signal's sparsity level
to satisfy $K = O(\mu_F^{-1})$. The reconstruction algorithm that we
analyze here is the OST algorithm of [9], which is described in
Algorithm 2. The following theorem extends the analysis of [9] and
shows
that the OST algorithm leads to near-optimal reconstruction 
error for large classes of sparse signals. 

Before proceeding further, we first define some notation.
We use $\mathrm{SNR} := \|x\|^2 / \mathbb{E}[\|e\|^2]$ to denote the
signal-to-noise ratio associated with the signal reconstruction
problem. Also, we use
$\mathcal{T}_\sigma(t) := \{n : |x_n| > \frac{2\sqrt{2}}{1-t} \sqrt{2\sigma^2 \log N}\}$
for any $t \in (0, 1)$ to denote the locations of all the entries of
$x$ that, roughly speaking, lie above the noise floor $\sigma$.
Finally, we use
$\mathcal{T}_\mu(t) := \{n : |x_n| > \frac{20}{t} \mu_F \|x\| \sqrt{2 \log N}\}$
to denote the locations of entries of $x$ that, roughly speaking,
lie above the self-interference floor $\mu_F \|x\|$.

Algorithm 2 One-Step Thresholding (OST) [9]
Input: An $M \times N$ unit norm frame $F$, a vector $y = Fx + e$,
and a threshold $\lambda > 0$
Output: An estimate $\hat{\mathcal{K}} \subseteq \{1, \ldots, N\}$ of the support of $x$ and
an estimate $\hat{x} \in \mathbb{C}^N$ of $x$
  $\hat{x} \leftarrow 0$ {Initialize}
  $z \leftarrow F^* y$ {Form signal proxy}
  $\hat{\mathcal{K}} \leftarrow \{n : |z_n| > \lambda\}$ {Select indices via OST}
  $\hat{x}_{\hat{\mathcal{K}}} \leftarrow (F_{\hat{\mathcal{K}}})^\dagger y$ {Reconstruct signal via least-squares}
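The steps of Algorithm 2 translate directly into NumPy. The following is a minimal sketch rather than a tuned implementation: the threshold is passed in by the caller, the demonstration is noiseless, and the test frame (an identity block next to a scaled Sylvester Hadamard block, with worst-case coherence 1/4) is a hypothetical choice of ours that makes the outcome easy to verify by hand.

```python
import numpy as np

def one_step_thresholding(F, y, lam):
    """One-step thresholding: threshold the proxy F^* y, then least-squares
    on the selected columns. Returns (support estimate, signal estimate)."""
    N = F.shape[1]
    x_hat = np.zeros(N, dtype=complex)                # initialize
    z = F.conj().T @ y                                # signal proxy
    support = np.flatnonzero(np.abs(z) > lam)         # indices above threshold
    if support.size > 0:
        coef, *_ = np.linalg.lstsq(F[:, support], y, rcond=None)
        x_hat[support] = coef                         # least-squares reconstruction
    return support, x_hat

# Hypothetical demo frame: F = [I | H / sqrt(M)], H a 16 x 16 Hadamard matrix.
H = np.array([[1.0]])
for _ in range(4):
    H = np.kron(H, np.array([[1.0, 1.0], [1.0, -1.0]]))
M = H.shape[0]
F = np.hstack([np.eye(M), H / np.sqrt(M)])

x = np.zeros(2 * M)
x[[0, 5]] = [1.0, -1.0]                               # 2-sparse signal
y = F @ x                                             # noiseless measurements

support, x_hat = one_step_thresholding(F, y, lam=0.75)
print(support, np.linalg.norm(x - x_hat))
```

In this example the proxy equals the signal on the identity block and has magnitude at most 1/2 on the Hadamard block, so a threshold of 0.75 separates the two; with noise, the threshold would instead have to account for the noise level.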

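Theorem 15 (Reconstruction of sparse signals). Take
an $M \times N$ unit norm frame $F$ which satisfies the
Strong Coherence Property, pick $t \in (0, 1)$, and choose
$\lambda = \sqrt{2\sigma^2 \log N} \, \max\{\frac{10}{t} \mu_F \sqrt{M \, \mathrm{SNR}}, \frac{\sqrt{2}}{1-t}\}$.
Further, suppose $x \in \mathbb{C}^N$ has support $\mathcal{K}$ drawn
uniformly at random from all possible $K$-subsets of
$\{1, \ldots, N\}$. Then provided

$$K \le \frac{N}{c_1^2 \|F\|_2^2 \log N}, \tag{5}$$

Algorithm 2 produces $\hat{\mathcal{K}}$ such that
$\mathcal{T}_\sigma(t) \cap \mathcal{T}_\mu(t) \subseteq \hat{\mathcal{K}} \subseteq \mathcal{K}$
and $\hat{x}$ such that

$$\|x - \hat{x}\| \le c_2 \sqrt{\sigma^2 |\hat{\mathcal{K}}| \log N} + c_3 \|x_{\mathcal{K} \setminus \hat{\mathcal{K}}}\| \tag{6}$$

with probability exceeding $1 - 10N^{-1}$. Finally, defining
$T := |\mathcal{T}_\sigma(t) \cap \mathcal{T}_\mu(t)|$, we further have

$$\|x - \hat{x}\| \le c_2 \sqrt{\sigma^2 K \log N} + c_3 \|x - x_T\| \tag{7}$$

in the same probability event. Here, $c_1 = 37e$,
$c_2 = \frac{2}{1 - e^{-1/2}}$, and
$c_3 = 1 + \frac{e^{-1/2}}{1 - e^{-1/2}}$ are numerical constants.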

A few remarks are in order now for Theorem 15. First, if
$F$ satisfies the Strong Coherence Property and $F$ is nearly
tight, then OST handles sparsity that is almost linear in $M$:
$K = O(M / \log N)$ from (5). Second, the $\ell_2$ error associated
with the OST algorithm is the near-optimal (modulo the log
factor) error of $\sqrt{\sigma^2 K \log N}$ plus the best $T$-term
approximation error caused by the inability of the OST algorithm
to recover signal entries that are smaller than
$O(\mu_F \|x\| \sqrt{2 \log N})$. Nevertheless, it is easy to
convince oneself that such error is still near-optimal for large
classes of sparse signals. Consider, for example, the case where
$\mu_F = O(1/\sqrt{M})$, the magnitudes of $K/2$ nonzero entries of
$x$ are some $\alpha = \Omega(\sqrt{\sigma^2 \log N})$, while the
magnitudes of the other $K/2$ nonzero entries are not necessarily
the same but scale as $O(\sqrt{\sigma^2 \log N})$. Then we have from
Theorem 15 that $\|x - x_T\| = O(\sqrt{\sigma^2 K \log N})$, which
leads to near-optimal $\ell_2$ error of
$\|x - \hat{x}\| = O(\sqrt{\sigma^2 K \log N})$.
To the best of our knowledge, this is the first result in the
sparse signal processing literature that does not require RIP
and still provides near-optimal reconstruction guarantees for
such signals in the presence of noise, while using either random
or deterministic frames, even when $K = O(M / \log N)$.